Incremental construction and maintenance of morphological analysers based on augmented letter transducers
نویسندگان
چکیده
We define deterministic augmented letter transducers (DALTs), a class of finitestate transducers which provide an efficient way of implementing morphological analysers which tokenize their input (i.e., divide texts in tokens or words) as they analyse it, and show how these morphological analysers may be maintained (i.e., how surface form–lexical form transductions may be added or removed from them) while keeping them minimal; efficient algorithms for both operations are given in detail. The algorithms may also be applied to the incremental construction and maintentance of other lexical modules in a machine translation system such as the lexical transfer module or the morphological generator.
منابع مشابه
Augmented Reality System and Maintenance of Oil Pumps
Qualification of employees who operate technological processes directly influences the safety of production. However, the employees’’ qualification cannot completely exclude human factor. Today, there are many technologies that can minimize or eliminate human factor impact on production safety ensuring. The augmented reality technology is an example of this technology. Nowadays, the augmented r...
متن کاملArabic Diacritization Using Weighted Finite-State Transducers
Arabic is usually written without short vowels and additional diacritics, which are nevertheless important for several applications. We present a novel algorithm for restoring these symbols, using a cascade of probabilistic finitestate transducers trained on the Arabic treebank, integrating a word-based language model, a letter-based language model, and an extremely simple morphological model. ...
متن کاملDeveloping Morphological Analysers for South Asian Languages: Experimenting with the Hindi and Gujarati Languages
A considerable amount of work has been put into development of stemmers and morphological analysers. The majority of these approaches use hand-crafted suffix-replacement rules but a few try to discover such rules from corpora. While most of the approaches remove or replace suffixes, there are examples of derivational stemmers which are based on prefixes as well. In this paper we present a rule-...
متن کاملRFID-based decision support within maintenance management of urban tunnel systems
Efficiently, tracking information related to components, materials and equipment from the production/construction phase to operation and maintenance is a challenge in the industries. The industry environment is a natural fit for generating and utilizing instance-level data for decision support. Advanced electronic identification and data storage technologies e.g. radio frequency identification ...
متن کاملRFID-based decision support within maintenance management of urban tunnel systems
Efficiently, tracking information related to components, materials and equipment from the production/construction phase to operation and maintenance is a challenge in the industries. The industry environment is a natural fit for generating and utilizing instance-level data for decision support. Advanced electronic identification and data storage technologies e.g. radio frequency identification ...
متن کامل